A General Technique to Train Language Models on Language Models

نویسنده

  • Mark-Jan Nederhof
چکیده

We show that under certain conditions, a language model can be trained on the basis of a second language model. The main instance of the technique trains a finite automaton on the basis of a probabilistic context-free grammar, such that the Kullback-Leibler distance between grammar and trained automaton is provably minimal. This is a substantial generalization of an existing algorithm to train an n-gram model on the basis of a probabilistic context-free grammar.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Myhill-Nerode Fuzzy Congruences Corresponding to a General Fuzzy Automata

Myhill-Nerode Theorem is regarded as a basic theorem in the theories of languages and automata and is used to prove the equivalence between automata and their languages. The significance of this theorem has stimulated researchers to develop that on different automata thus leading to optimizing computational models. In this article, we aim at developing the concept of congruence in general fuzzy...

متن کامل

On the Iranian In-service and Pre-service Language Teachers’ Perceptions of Educational Supervision Concerning their Professional Development

Teacher supervision plays a pivotal role in the improvement of education system and the way in which teachers and student teachers perceive it. Consequently language teacher supervisors can utilize appropriate supervisory models to keep teachers update and promote them professionally. The present study investigated the role of language teacher supervisors in student teachers and in-service teac...

متن کامل

Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...

متن کامل

Classification of Chest Radiology Images in Order to Identify Patients with COVID-19 Using Deep Learning Techniques

Background and Aim: Due to the important role of radiological images for identifying patients with COVID-19, creating a model based on deep learning methods was the main objective of this study. Materials and Methods: 15,153 available chest images of normal, COVID-19, and pneumonia individuals which were in the Kaggle data repository was used as dataset of this research. Data preprocessing inc...

متن کامل

Non-Verbal Communication in Models of Communicative Competence and L2 Teachers’ Rating

Non-verbal communication (NVC) plays a major role in various aspects of human life (Andersen, 2004; Cameron, 2001; Johnstone, 2008). Children learning their first language come to realize non-verbal communication as their socialization process takes place (Fletcher & German, 1990; Ingram, 1996; Owens, 2001). However, most EFL learners may have little exposure to these non-verbal aspects of comm...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computational Linguistics

دوره 32  شماره 

صفحات  -

تاریخ انتشار 2005